CAGEF_services_slide.png

Introduction to Python for Data Science

Lecture 04: Flow Control


0.1.0 About Introduction to Python

Introduction to Python is brought to you by the Centre for the Analysis of Genome Evolution & Function (CAGEF) bioinformatics training initiative. This course was developed based on feedback on the needs and interests of the Department of Cell & Systems Biology and the Department of Ecology and Evolutionary Biology.

The structure of this course is a code-along style; it is 100% hands on! A few hours prior to each lecture, the materials will be available for download at QUERCUS. The teaching materials will consist of a Jupyter Notebook with concepts, comments, instructions, and blank spaces that you will fill out with Python code along with the instructor. Other teaching materials include an HTML version of the notebook, and datasets to import into Python when required. This learning approach will allow you to spend the time coding and not taking notes!

As we go along, there will be some in-class challenge questions for you to solve either individually or in cooperation with your peers. Post lecture assessments will also be available (see syllabus for grading scheme and percentages of the final mark).

0.1.1 Where is this course headed?

We'll take a blank slate approach to Python here and assume that you know pretty much nothing about programming. From the beginning of this course to the end, we want to take you from some potential starting scenarios:

and get you to a point where you can:


0.2.0 Lecture objectives

Welcome to this fourth lecture in a series of six. Today we're going to branch off into the wonderful world of flow control and how you can really make your code work for you.

At the end of this lecture we will aim to have covered the following topics:

  1. What is flow control?
  2. Logical, conditional, and comparison operators
  3. For Loops
  4. Conditional Loops

0.3.0 A legend for text format in Jupyter markdown

grey background - a package, function, code, command or directory. Backticks are also used for in-line code.
italics - an important term or concept or an individual file or folder
bold - heading or a term that is being defined
blue text - named or unnamed hyperlink

... - Within each coding cell this will indicate an area of code that students will need to complete for the code cell to run correctly.

Blue box: A key concept that is being introduced
Yellow box: Risk or caution
Green boxes: Recommended reads and resources to learn Python

0.4.0 Data used in this lesson

Today's datasets will focus on using Python lists and the NumPy package.

0.4.1 Dataset 1: subset_taxa_metadata_merged.csv

This is our merged dataset from last week. We'll revisit this to play around with looping through the two-dimensional DataFrame.


0.5.0 Packages used in this lesson

IPython and InteractiveShell will be accessed just to set the behaviour we want for IPython so we can see multiple code outputs per code cell.

numpy provides a number of mathematical functions as well as the special data class of arrays which we'll be learning about today.

pandas provides the DataFrame class that allows us to format and play with data in a tabular format.

time provides various time-related functions.

1.0.0 What is flow control and why is it important?

Flow control constructs allow us to repeat a task over and over until there are no more iterations to perform, or until a condition that we set is no longer met. This means that, with a few lines of code, you can perform tasks that would otherwise require copying and pasting your code hundreds or thousands of times.

Flow control is one of the most important skills to have in your computer programming toolbox. All programming languages use it, and the logic behind it is very similar across languages, though the syntax differs. Under the hood of Python, its packages, methods and functions all have some form of flow control implemented, especially in cases where it seems like a single command is accomplishing a lot.

Having a good understanding of data subsetting and of logical, conditional, and comparison operators is critical to writing flow control programs. Thus, we will start off this lecture with a recap of some of those concepts from previous lectures.

control.flow.jpg

Image from https://codewithlogic.wordpress.com/2013/09/01/python-basics-understanding-the-flow-control-statements/


2.0.0 Logical, conditional, and comparison operators recap

One key aspect of these types of operators is that their output is boolean (True or False), and those outputs can be used to perform a wide range of operations. Here are some comparison operators.

2.1.0 Logical operators

We've already seen the logical operators in previous lectures. They're used to generate logical expressions that we use to filter values or set conditions for further steps. We'll even use these to determine the branching of code (i.e., control of flow). Here's a table briefly summarizing these operators:

| Operator | Description |
| --- | --- |
| `>` | Greater than |
| `>=` | Greater than or equal to |
| `<` | Less than |
| `<=` | Less than or equal to |
| `==` | Equivalent values (but not necessarily equivalent objects in memory) |
| `!=` | Inequality or dissimilar values |

These are quite straight-forward to work with for integers or floats.

2.1.1 Logical operators can evaluate strings

The rules for using logical operators on strings are slightly different from those for integers. When comparing strings, the following procedure is followed:

  1. Characters are compared by matching indices between strings
  2. When characters are not equivalent, their Unicode value is compared
  3. The character with the lower Unicode value is considered to be smaller
  4. The longer string is considered larger when the character values are equal.

Here's a Unicode table to help us out with our interpretation.

Unicode_Table.jpg

Let's give it a try shall we?
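A minimal sketch of the rules above (the example strings are made up for illustration):

```python
# Characters are compared index by index; the first mismatch decides
# the result based on Unicode code points (ord() reveals them).
print(ord("a"), ord("b"))        # 97 98
result_1 = "apple" < "banana"    # 'a' (97) < 'b' (98), so True
result_2 = "Zebra" < "apple"     # 'Z' (90) < 'a' (97): uppercase sorts first
result_3 = "app" < "apple"       # equal so far, so the shorter string is smaller
print(result_1, result_2, result_3)
```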


2.1.2 Values being compared must be of the same type

We've seen that we can compare integers with integers and how strings can be compared but we can't simply compare dissimilar object types. So no comparing apples to sheep - they just don't stack up.
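Here is a small sketch of what happens when we try. Note that equality checks like `5 == "five"` simply return False rather than raising an error; it is the ordering operators that complain:

```python
# Ordering comparisons between dissimilar types raise a TypeError,
# which we can catch to inspect the message.
try:
    5 < "five"
except TypeError as err:
    caught = type(err).__name__
    print(caught, "-", err)
```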


2.1.3 Objects must be the same size for comparison

Remember that objects must also be of the same size to complete a comparison using Python's built-in operators. Furthermore, logical comparison between list objects can be complicated. Comparison of lists uses lexicographical order by comparing elements at each index, beginning with index = 0. As elements are compared, they must also follow the previous rules we've outlined.

Read more: If you want to learn more about comparing lists in Python, information can be found here.

If you want to retrieve the results of a proper element-wise comparison you'll have to use something like the Numpy package.
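A quick sketch of the difference, using made-up lists:

```python
import numpy as np

list_a = [1, 2, 3]
list_b = [1, 5, 3]
# Built-in comparison of lists yields a single result...
print(list_a == list_b)                        # False
# ...while NumPy arrays compare element-wise.
elementwise = np.array(list_a) == np.array(list_b)
print(elementwise)                             # [ True False  True]
```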


2.2.0 Boolean operators

The boolean operators are used for combining True and False values that can come in various formats. We've already come across some examples last lecture when we were filtering our data. Boolean operators can be used to combine logical expressions, variables, or both. We have four operators at our disposal to combine or compare boolean (logical) and non-boolean (bitwise) values.

| Operator | Description | Evaluation rules |
| --- | --- | --- |
| `and` | Logical AND results in True only when all comparisons are True | `True and True` = `True`; `True and False` = `False`; `False and False` = `False` |
| `&` | Bitwise AND compares the binary values of an integer at every bit | `1010 1010 & 0101 0101` = `0000 0000` |
| `or` | Logical OR results in False only when all comparisons are False | `True or True` = `True`; `True or False` = `True`; `False or False` = `False` |
| `\|` | Bitwise OR compares the binary values of an integer at every bit | `1010 1010 \| 0101 0101` = `1111 1111` |

Recall that integers can be converted to booleans, with any value other than 0 being considered True. So we can also use bitwise comparison on these booleans, although and and or are more appropriate.

Note also that the comparison operators (<, >, ==, etc.) take higher precedence than and, or, and not when being evaluated within an expression. Conversely, bitwise & and | take higher precedence than the comparison operators, so appropriate use of parentheses () will be required.
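A short sketch of how that precedence plays out (values made up for illustration):

```python
x = 5
# Comparisons evaluate first, then `and`, so no parentheses are needed.
a = x > 2 and x < 10
# Bitwise & binds tighter than > and <, so each comparison must be
# parenthesised or Python would try to evaluate 2 & x first.
b = (x > 2) & (x < 10)
print(a, b)    # both True
```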


2.2.1 Use the logical not to negate your boolean values

The final operator we'll review is the logical NOT. This is a unary operator that can evaluate a single input and returns the opposite boolean value to that input. This can be used to negate the boolean evaluation from a logical expression. This can be especially useful when generating conditional statements that will determine which parts of your code are run (ie control of flow).


2.2.2 The logical not can be used on non-boolean objects

Beware: the logical not does not apply across multiple objects.

However, a quick and easy way to determine the status of an object is to use the logical not. It will return False unless the object is empty. This can also be a very useful way to determine whether a variable has been assigned a non-empty object.
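A small sketch of this behaviour on a few "empty" values:

```python
# `not` treats empty containers, empty strings, 0, and None as falsy,
# so negating them yields True.
empty_list_check = not []        # True: the list is empty
full_list_check = not [1, 2]     # False: the list has contents
empty_string_check = not ""      # True
zero_check = not 0               # True
print(empty_list_check, full_list_check, empty_string_check, zero_check)
```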

Here ends the recap on logical operators. Time to loop!


3.0.0 for loops

for loops allow you to iteratively perform operations or data manipulation. Their general structure is

for item in iterable:
     statement

In the above general structure:

In plain English, it means something like "for every item in iterable, do statement until you reach the last element in iterable."

The last thing to note is the indentation in this for loop. Up until now, we have not really been using any tabbed indentation style in our coding. Normally we use tabbed indentation to help make our code more readable, e.g. by indenting the statements inside a for loop.

Python takes this philosophy to the next step by requiring indentation as grammar: statements must be indented to be considered part of a control flow structure. We'll see what that means in upcoming examples.
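A minimal sketch of that structure (the list values are made up for illustration):

```python
# Every indented statement belongs to the loop body; the unindented
# print runs once, after the loop has finished.
collected = []
for item in ["a", "b", "c"]:
    collected.append(item)     # inside the loop: runs three times
    print("looping:", item)
print("done:", collected)      # outside the loop: runs once
```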

Here are a some definitions of concepts that we will be using today:

Iteration: the repetition of a sequence of computer instructions a specified number of times or until a condition is met.¹

Iterator: an object that contains a countable number of values that can be iterated upon, meaning that you can traverse through all the values.²

Iterable: an object that has an __iter__ method which returns an iterator.

¹https://www.merriam-webster.com/dictionary, ²https://www.w3schools.com/python/python_iterators.asp

3.1.0 Looping over data structures

There are several structures that are iterable, including native structures such as lists, tuples, dictionaries, and sets, and non-native structures such as multidimensional Numpy arrays and Pandas DataFrames. They are iterable containers because you can retrieve iterators from them (https://www.w3schools.com/python/python_iterators.asp).

3.1.1 Iterators

The job of iterators is to create a "count" or "index" of the elements over which you want to iterate, thus creating a road map to loop over an iterable (a data structure). There are several functions that are meant to be iterators or that can also work as iterators, and which one to use varies with the program you want to write and the iterable you are working with. Here are some of the most common iterators in Python:

³ A callable is an object that allows you to use round parentheses ( ). ⁴ A sentinel value is a condition that indicates the termination of a recursive algorithm.


3.2.0 Iterating through lists

The built-in list structure represents a mutable structure that you will likely work with often to iterate through. Let's iterate through some examples.


3.2.1 Use loops to increment a counter

Sometimes you might want to count the number of iterations that occur within a loop. This may be part of some branching code that we'll look at later. With a simple version of such code you can answer, for example, "how many odd integers are in my list?".

Let's see how a loop can be used to increment a variable's value.
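A minimal sketch of incrementing a counter inside a loop (the list values are made up):

```python
numbers = [3, 8, 7, 12, 15, 4]
count = 0
for item in numbers:
    count = count + 1          # increment on every pass through the loop
    print("iteration", count, "saw", item)
print("total iterations:", count)
```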


Replace 1 by 30, and item by count in the print function, and run the code again. What do you think this code is doing?


3.2.2 Sum the elements of a list with a for loop

Cumulative summation can be done through for loops although we also have the sum() function to accomplish that. Do you wonder how the sum() function actually works?
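Here is a sketch of cumulative summation, checked against the built-in (example values are made up):

```python
values = [2, 4, 6, 8]
total = 0
for value in values:
    total = total + value      # add each element to the running total
print(total)                   # 20
print(total == sum(values))    # our loop matches the built-in sum()
```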


3.2.3 Subset using the assigned iterable variable in your loop

Now, lets try to subset using the iterable variable in our for loop. Remember that values will be assigned to an item variable from our iterable with each passing loop.


Python is not happy about it... It says that list indices must be integers or slices, so let's try with integers to see how it behaves. Recall what we know about lists. Can we subset or slice a list using string values?


3.2.4 Iterators help to traverse your iterable

Look at what happened above. We ended up accessing an element outside the range of the indices in our list. Even though we had a sequential set of integers, we started at 1 and lists are zero-indexed. Simply using the values from the list isn't the correct way to iterate through it either. What we really want is a list of integer values starting at 0 and going to the length of our list.

Here is where iterators play an important role. Before jumping into iterators, let's see what happens when we pass photosynthesis_type to the len() function.


3.2.5 Use the range() function as a for loop interator

Okay, we've looked at a lot of ways on how not to make a for loop. Remember, the for loop by itself does not know what to do with a single integer (the output of len()). Instead, let's use the range() function. Recall that the default behaviour when a single input is provided is to calculate [0, stop), where we use [ to denote inclusivity and ) to denote exclusivity.

The range() function, of course, returns an iterable. Let's give it a try!


The above TypeError means that range() has no idea what to do with a list; it expects integers. What if we combine range() and len()?


Now it works. The loop now has an index to iterate over ("from 0 to the last item").
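A minimal sketch of the pattern (the list values stand in for the notebook's photosynthesis_type list, which isn't shown here):

```python
photosynthesis_type = ["C3", "C4", "CAM"]
# range(len(...)) yields the valid indices 0, 1, 2 for a 3-element list.
for i in range(len(photosynthesis_type)):
    print(i, photosynthesis_type[i])
indices = list(range(len(photosynthesis_type)))
```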

So to summarize, we've used a for loop:

What about other data structures?


3.3.0 Iterating through dictionaries

Recall that dictionaries consist of key:value pairs and that unlike lists, they have no index. Instead, they are accessed by providing a matching key. When we provide a dictionary object to a for loop it will return an iterator to its hash/keys.

Let's revisit our amino acid dictionary from lecture 2.


3.3.1 Iterate through your dictionary values by indexing with the hash

Now that we know we can get the key information in our for loop, we can use that much like we did with our list examples to iterate through the value information stored in the dictionary.
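A small sketch with a stand-in for the lecture's amino acid dictionary (the entries here are assumptions):

```python
amino_acids = {"A": "Alanine", "C": "Cysteine", "D": "Aspartate"}
names = []
for key in amino_acids:             # the loop yields keys by default
    names.append(amino_acids[key])  # index with the key to reach the value
print(names)
```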


3.3.2 Use the attributes of a dictionary as an iterator

Besides the dictionary itself, you can also iterate through its attributes, like the keys and values. Remember that we can use methods from the dictionary object to return this information for us. There are three methods we can use for this purpose: keys(), values(), and items().


Or get the whole dictionary


Each key:value pair is printed as a tuple.


3.3.3 Use the for loop to assign multiple variables from your iterator

Knowing that the items() method returns tuple objects from our dictionary - specifically with two elements each - can we take advantage of that information? Rather than index the information from the tuple, let's try to assign multiple variables to the elements of each tuple with the for loop itself.
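A minimal sketch of tuple unpacking in the loop header (the dictionary entries are made up):

```python
amino_acids = {"A": "Alanine", "C": "Cysteine"}
# Two loop variables unpack each (key, value) tuple from items().
for key, value in amino_acids.items():
    print(key, "->", value)
pairs = [(k, v) for k, v in amino_acids.items()]
```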


As you can see, compared to iterating over items() with a single variable, this time each pair is returned as two variables, key and value, not as a tuple. Subtle differences like this one are commonly found across code that "seems" to behave in the same way. Be aware of these differences, and of others like the type of object that is generated, as these factors influence how you should write your code.

3.4.0 Looping over one-dimensional Numpy arrays

Numpy arrays are not built-in data structures; therefore special iterators are programmed as part of Numpy to work on arrays. In fact, all packages that produce iterable objects should include basic methods like __iter__ that Python expects to find when the object is provided to something like a for loop.


3.4.1 Arrays and loops can be combined for broadcasting

Remember that arrays are data structures designed for broadcasting. That means we can do things like multiply across elements, or replace or fill multiple elements at once (in the case of DataFrames).

Challenge

Create a for loop that multiplies the first four elements of array_1 by 3. Store each iteration in an object called iteration.


3.4.2 Obtain multiple iterators with the nditer() function

So far we've only been generating a single iterator using the base behaviours of the for loop. However, we can use functions that return multiple iterables to us. In turn that can provide multiple iterators similar in idea to what we saw with Dictionaries.
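A minimal sketch of np.nditer() walking two arrays in lockstep (the array values are made up):

```python
import numpy as np

array_1 = np.array([1, 2, 3])
array_2 = np.array([10, 20, 30])
# Passing a list of arrays to nditer yields one element from each
# array per pass, similar in spirit to iterating dictionary items().
sums = []
for a, b in np.nditer([array_1, array_2]):
    sums.append(int(a + b))
print(sums)    # [11, 22, 33]
```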


3.4.3 Use nditer() to help broadcast between dissimilar array sizes

Another feature of nditer() is how it handles the production of iterators for multidimensional arrays and the idea of broadcasting. Suppose that instead of two 1x6 arrays, array_2 was two-dimensional? With arrays and the right coding, we can broadcast across rows. Just be sure the sizes match properly or you'll receive an error.

Speaking of 2D-arrays, let's take a closer look at iterating through them.


3.4.4 Iterating through Numpy 2D-arrays using nditer()

From our previous example, it looks like iterating through a 2D array seems pretty straight-forward. Iteration over 2D Numpy arrays, however, is slightly more complex than with 1D counterparts if you are re-arranging it on the fly.

"An important thing to be aware of for this iteration is that the order is chosen to match the memory layout of the array instead of using a standard C or Fortran ordering. This is done for access efficiency, reflecting the idea that by default one simply wants to visit each element without concern for a particular ordering. We can see this by iterating over the transpose of our previous array, compared to taking a copy of that transpose in C order." https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.nditer.html

What does all that mean? In simpler terms the iterator for an array uses the same order as it is stored in memory regardless of the shape the array may be in. Let's see how that plays out in practice


3.4.5 Use the order parameter to override how iterator elements are made in nditer()

See how the process of copying the array has re-arranged its elements in memory as well?

You don't necessarily want to copy your objects every time you want to move through them after transposing or reshaping them. Instead you should look to the specific parameters of nditer(), in particular the order parameter, which takes on the values of:

Read more: For more information on this parameter check out the documentation

3.5.0 Looping over Pandas' data frames

Let's import subset_taxa_metadata_merged.csv as data. Before we start looping over the file, let's do another quick recap on subsetting data frames.


3.5.1 Use for loops to import large datasets in smaller chunks

Assuming that you only need to access parts of a file at one time to gather summary information, you can break down large files that will not fit in memory by importing them in smaller chunks. This saves memory and potentially time, as you don't have to wait for the whole file to load. And if information in the file is treated independently between lines or sections - like large sequencing files - you can work with the data in smaller bites.

Luckily for us, the read_csv() function has a parameter chunksize that we can use to set how many lines we'd like in each chunk. By activating this parameter, the function read_csv() automatically returns an iterable object called a TextFileReader.
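A minimal sketch of chunked reading. The in-memory CSV here is a made-up stand-in for subset_taxa_metadata_merged.csv:

```python
import io
import pandas as pd

# A tiny in-memory stand-in for the real merged dataset.
csv_text = "genus,count\nStreptococcus,120\nLactobacillus,5\nPrevotella,80\nRothia,40\n"

total = 0
# chunksize=2 makes read_csv return a TextFileReader that yields
# DataFrames of (at most) two rows each.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    total += chunk["count"].sum()
print(total)    # 245
```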

Reorganize the columns so count will be at index 1


3.5.2 Subsetting DataFrames recap

Recall there are a number of ways to subset a DataFrame object. We'll focus mainly on the multi-indexing methods, which include:

The next piece of code is not going to work. Can you tell why?

Select only data where the genus is Streptococcus

Use and on this data frame and see what happens


3.5.3 Use logical_()* functions as element-wise boolean operators

In the above example we were simply trying to combine the boolean outputs between what would normally be two pandas Series objects. However Python could not determine the truth value of this object. Instead, we need to turn to the functions logical_and(), logical_or() and logical_not() to accomplish our task. They have the same behaviour as their Python counterparts but are able to properly handle the multi-dimensional data of these objects.

These functions are also distinguished from the bitwise operators & (AND), | (OR) and ~ (NOT) mainly by their order of precedence. Remember that these bitwise operators will take precedence of evaluation over the comparison operators (<, >, ==, etc.).

The logical_*() functions, however, are given a list of boolean statements over which they will element-wise evaluate the expression and return a result. These functions are part of the Numpy package and were designed to work specifically with ndarray objects. Recall that the Series class inherits its behaviours from ndarray.
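A minimal sketch of element-wise combination on a Series (the counts are made up, echoing the Lactobacillus filtering above):

```python
import numpy as np
import pandas as pd

counts = pd.Series([5, 150, 400, 8])
# Element-wise AND across two boolean Series, no parentheses needed:
mask = np.logical_and(counts > 10, counts < 300)
print(mask.tolist())               # [False, True, False, False]
print(counts[mask].tolist())       # [150]
```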

Let's see some follow-up examples.

We know there are Lactobacillus OTUs with less than 10 and with more than 300 counts. Why are they not showing up in the output?


Select Lactobacillus with less than 10 counts that do not come from Saliva samples

That is all for the recap on conditional and logical operators. Back to flow control.


3.5.4 The default for loop iterator for DataFrames returns column names

How do we print the first 10 observations from the GENUS column? We already have a number of routes to arrive at this solution but can we accomplish this using a for loop? Let's try the intuitive thing and just provide the DataFrame to the for loop first.


3.5.5 Provide a single DataFrame column to iterate in a for loop

No errors, but this is not what we wanted. Can you identify what is missing in the code? Our call unpacked all of the column names in data, not just the first 10. Python has no idea what we are asking for, so it defaults to iterating over the column names of a data frame.

So we definitely didn't provide Python with the code needed to interpret our intent. Would we be better off just providing a single column? Let's try.


3.5.5.1 Combine your code with the notna() method to further filter your for loop iterator values

As you can see, providing a single column generates an iterator through the elements of the series. That works but it doesn't get us the first 10 valid entries from the GENUS column. Instead, we should filter out missing or NaN values with the notna() method.

To avoid NaN in the output, pass notna() to the data subsetting. At this point our code is getting too long to perform the subsetting within the for loop, and we will be better off if we do the subsetting and create a variable before adding it to the for loop.

In addition to the subsetting, sort the data alphabetically in increasing order (from A to Z) after selecting the first 10 values.


The for loop below prints every genus that is not NaN. Now make the loop look better and easier to debug.


3.5.6 Yes you can use a for loop, but should you?

We've been having fun generating code that lets us iterate through our objects, but do remember that there are built-in functions for calculating the simpler things in our data structures. While we sometimes need the practice, it is often just a matter of efficiency - especially with large data sets.

Let's use a for loop to calculate some summary statistics on our filtered subset - just for practice.


3.5.7 Use list comprehension to quickly build/manipulate subsets of data

for loops do not always have to be at the beginning of your program. Instead we can build a for loop directly into a calculation if we want to use it to build a quick iterable for us to evaluate. This can take the form of

newlist = [expression for item in iterable if condition] where:

In the following example, we will calculate the standard deviation of bacterial counts by taking the square root of the variance.

$$\sigma = \sqrt{\frac{\sum(x_i - \bar{x})^2}{N}}$$

We'll build up slowly to get a sense of what's happening.


Now let's do something with our generator by using the pow(value, exponent) function to get the square of the difference between the value and the mean.

We now have the top half of our equation - which really takes care of the list comprehension part for us. We didn't need to filter the data but we could have included a condition like sum(pow(value - mean, 2) for value in data_fil['count'] if value > 0) which would alter our sum total (try it for yourself!)

Now we just need to divide by N and calculate the square root.
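Putting the whole formula together in one sketch (the data values are made-up stand-ins for data_fil['count']):

```python
import math

data = [4, 8, 6, 5, 3]                        # stand-in counts
mean = sum(data) / len(data)
# The comprehension builds the squared deviations; sum() collapses them.
squared_devs = [pow(x - mean, 2) for x in data]
std = math.sqrt(sum(squared_devs) / len(data))
print(round(std, 4))
```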

Use Numpy's np.std() function to corroborate your result


3.5.7.1 Challenge

Create a data frame of counts per microbe. We'll use a combination of filtering and method chaining. At the end we'll incorporate our values with the zip() function, which can combine iterables as columns to help us make a dataframe.

Do you recall another way to do this from last lecture?


3.6.0 Nested for loops

We've already discussed the concept of nested objects: lists, arrays, dictionaries. A nested for loop is a similar idea: having loops running as statements within your loops. There's no real limit to how many for loops you can nest but if you're deeply nesting for loops, there may be better ways to accomplish your goal.
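A minimal sketch of the nesting pattern, with made-up genera and body sites to echo the exercise below:

```python
genera = ["Streptococcus", "Lactobacillus"]
sites = ["Saliva", "Stool"]
combinations = []
for genus in genera:               # outer loop: runs once per genus
    for site in sites:             # inner loop: runs in full for each genus
        combinations.append((genus, site))
print(len(combinations))           # 2 genera x 2 sites = 4 pairs
```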

Now that we have the non-missing data in the form of data_fil, let's create a similar cumulative sum of counts, except on a per-genus, per-body-site basis. We'll change it up and build our results with a dictionary this time. It's really quite similar to what we had before, except all of the results are stored in a single variable.

We can use the sleep() function from the time module to show in real time what the loop does. We'll just do it on a subset of our data though.


3.6.1 Beware the power of the nested for loop

Based on the information we have, it appears that we produced exactly what we wanted: 221 unique genera and 5 unique body sites yielding 1105 total combinations. Not so fast though - do all of these combinations truly exist in our dataset? In fact there are only 533 combinations between these two sets. We can prove this by turning again to the groupby() method.

By using the nested for loop we produced combinations that didn't exist within our dataset. The problem arises because we take the sum([]) of an empty object, which returns 0 as a value. Thus we end up filling all the values of our DataFrame whether or not the combinations originally exist.

Work smarter, not harder: Existing functions should be your go-to option, either built-in or from a library. Those functions are optimized to get the job done very efficiently in terms of time and computational resources. Why reinvent the wheel?

4.0.0 Conditional control flow

We use the term conditionals to denote logical expressions that specifically evaluate to True or False and are used to determine how a program will run. Which set of code will it run next? Will it terminate a loop? This is where we also get the idea of flow control or control of flow.

4.1.0 The if statement executes when the conditional evaluates to True

The purpose of the if control statement is pretty clear. if a condition is met (True), then execute a statement. The following is the general structure of if:

if condition:
    statement

where condition can be a simple logical expression or a complex one involving many of the operators we've already covered. Let's give it a try.


4.1.1 The else statement executes when your if conditional evaluates to False

In the above code we have not set any instruction for when the condition evaluates to False. In that case, the statement line is not evaluated and therefore nothing is printed.

Think of the else statement much like plan B. It allows us to provide a catch-all set of code to run in the case where our conditional has "failed". Let's update our general code structure:

if condition:
    first_statement
else:
    second_statement

Simple, right?
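A minimal sketch of the if/else structure, using a made-up count and the 150 threshold from the exercise below:

```python
count = 90
if count > 150:
    abundant = "yes"      # runs only when the condition is True
else:
    abundant = "no"       # the catch-all when the condition is False
print(abundant)           # no
```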

The following code adds a boolean column called abundant to a subset of data called subdata (just for computational efficiency). Every observation (row) where the microbial count is greater than 150 will be classified as "yes" for abundant and "no" otherwise.

We'll also introduce two new concepts:

  1. the DataFrame method itertuples() which returns named tuples of values from our DataFrame. It allows us to iterate over rows as named tuples.
  2. The Unpacking/Packing operator *. Up until now we've used this operator for multiplication and other purposes but when placed directly to the left of an iterable, it can help to unpack the elements for passing on say as arguments for a function. Conversely, we can use it as part of a variable assignment to pack or repack an unspecified number of elements into a list as a single variable.
Read more: You can find out more about the packing and unpacking operator with this great tutorial

Let's practice with unpacking and packing before moving forward shall we?
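A short sketch of both directions (the values are made up for illustration):

```python
# Unpacking: * spreads a list's elements out as separate arguments.
nums = [1, 2, 3]
print(*nums)               # equivalent to print(1, 2, 3)

# Packing: * gathers "the rest" into a list during assignment.
first, *rest = [10, 20, 30, 40]
print(first, rest)         # 10 [20, 30, 40]
```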

Okay let's put that else statement to use now.


4.1.2 Use the elif statement if you have multiple conditions to check

Getting the hang of if and else? Next in line is elif. Consider elif an intermediate between if and else. It's literally a portmanteau of else and if which means if you want to check for multiple possible scenarios - usually (but not necessarily) with an order of precedence, then you can use the elif statement to go through that checklist. Let's see how it works.

Based on the microbial count, add a column called treatment that will be either treatment_A, treatment_B, or No action depending on the microbe counts (for the sake of this exercise, let's assume that all microbes have pathogenic potential on humans).

Remember: if a conditional fails, the statement within it will not be executed!
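A sketch of the checklist idea. The thresholds here (300 and 150) are assumptions for illustration; the exercise leaves the actual cutoffs to you:

```python
def classify(count):
    # Conditions are checked top-down; the first True branch wins
    # and the remaining branches are skipped.
    if count > 300:
        return "treatment_A"
    elif count > 150:
        return "treatment_B"
    else:
        return "No action"

print(classify(400), classify(200), classify(50))
```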


4.2.0 Use while loops when you want to iterate based on a condition

while loops run "while" a condition continues to evaluate to True. At the start of each loop, the condition is re-evaluated before a decision is made. If a for loop and an if statement were to make a weird code-baby, the while loop would be it.

You can use the conditional to iterate in different ways like:

  1. Walking or moving through an iterable
  2. Counting through a specific number of "successful" operations

Let's experiment with how that works shall we?
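A minimal sketch of a counting while loop:

```python
z = 0
steps = []
while z < 5:           # the condition is re-checked before every pass
    steps.append(z)
    z += 1             # without this increment, the loop would never end
print(steps)           # [0, 1, 2, 3, 4]
```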


4.2.1 Make sure your while conditional can evaluate to False

In our first example, there is an eventual end to the list because we are permanently removing items. Therefore the conditional will evaluate to False when the list is empty (ie []).

Our second example, however, requires us to remember to increment our variable z. Since this is quite a simple loop it's not an issue as we always increment the value of z. In other cases with complex branching code with if and/or elif statements you must be careful to check that your conditional will eventually fail.

Let's try another example where we print only those rows from subdata where the genus is either 'Streptococcus' or 'Lactobacillus'.

Now we have a list of two: "strept" and "lactobac". We can use a for loop to unlist them, then use Pandas' concat() to join them into a single data frame.


4.3.0 Advance through an iterator without looping using the next() function

Recall from the lecture 03 appendix that for each iterator, we can use the next() function to retrieve the next item in the queue; the iterator remembers its place in the queue. This continues until the last element is evaluated and then the iterator is empty.

If you try to go past the last element, Python will provide a StopIteration error to let you know you've gone too far.

Let's practice with the next() function.
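A minimal sketch of advancing an iterator until it is exhausted (the list values are made up):

```python
it = iter([10, 20])       # get an iterator from the list
first = next(it)          # 10
second = next(it)         # 20
try:
    next(it)              # the iterator is exhausted: raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True
print(first, second, exhausted)
```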


4.4.0 break, and continue can interrupt loops

Sometimes you may be looping through with a for or while loop when an unexpected condition occurs. Perhaps you wanted to error-proof your code or need to exit a loop based on internal conditions encountered while examining your data. Sometimes you may have a last-ditch conditional to prevent yourself from iterating too many times, or even endlessly.

When you need to explicitly exit a loop, you can use the break command. This will end the loop without further repetition.

Alternatively, you may have a long series of code that you don't want to even bother evaluating with more conditionals (to save on processing power for instance). You can end the current iteration of a loop and begin the next using the continue command.

Let's work through a few examples
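A sketch combining both commands, using 0 as a made-up sentinel value:

```python
kept = []
for value in [3, -1, 7, 0, 9]:
    if value == 0:
        break              # stop the loop entirely at the sentinel 0
    if value < 0:
        continue           # skip negatives and move to the next value
    kept.append(value)
print(kept)                # [3, 7]: the 9 after the sentinel is never reached
```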


5.0.0 Class summary

That's our fourth class on Python! You've made it through and we've learned about a number of logical expression operators and how to apply them in loops and filtering data:

  1. Flow control
  2. Logical, conditional, and comparison operators
  3. For loops
  4. Conditional control flow

5.1.0 Post-lecture assessment (12% of final grade)

Soon after the end of this lecture, a homework assignment will be available for you in DataCamp. Your assignment is to complete chapters 3-5 (Logic, Control Flow and Filter, 1500 possible points; Loops, 1450 possible points; and Case Study, 1200 possible points) from the Intermediate Python course. This is a pass-fail assignment, and in order to pass you need to achieve at least 3112 points (75%) of the total possible points. Note that when you take hints from the DataCamp chapter, it will reduce your total earned points for that chapter.

In order to properly assess your progress on DataCamp, at the end of each chapter, please take a screenshot of the summary. You'll see this under the "Course Outline" menubar seen at the top of the page for each course. It should look something like this where it shows the total points earned in each chapter:

DataCamp.example.png

Submit the file(s) for the homework to the assignment section of Quercus. This allows us to keep track of your progress while also producing a standardized way for you to check on your assignment "grades" throughout the course.

You will have until 13:59 hours on Thursday, July 22nd to submit your assignment just before the start of lecture that week.


6.0.0 Appendix

6.1.0 Resources


6.2.0 Acknowledgements

Revision 1.0.0: materials prepared by Oscar Montoya, M.Sc. Bioinformatician, Education and Outreach, CAGEF.

Revision 1.1.0: edited and prepared for CSB1021H S LEC0140, 06-2021 by Calvin Mok, Ph.D. Bioinformatician, Education and Outreach, CAGEF.


The Center for the Analysis of Genome Evolution and Function (CAGEF)

The Centre for the Analysis of Genome Evolution and Function (CAGEF) at the University of Toronto offers comprehensive experimental design, research, and analysis services in microbiome and metagenomic studies, genomics, proteomics, and bioinformatics.

From targeted DNA amplicon sequencing to transcriptomes, whole genomes, and metagenomes, from protein identification to post-translational modification, CAGEF has the tools and knowledge to support your research. Our state-of-the-art facility and experienced research staff provide a broad range of services, including both standard analyses and techniques developed by our team. In particular, we have special expertise in microbial, plant, and environmental systems.

For more information about us and the services we offer, please visit https://www.cagef.utoronto.ca/.

CAGEF_new.png